Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

modify summarise_draws internals not to coerce values to same type #356

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

n-kall
Copy link
Collaborator

@n-kall n-kall commented Mar 15, 2024

Summary

This would fix #355.

As it is affecting summarise_draws, which is one of the main functions in posterior, I think this change needs to be considered carefully. It might be less efficient than currently implemented, so I am opening this PR now to check how the benchmarks for summarise_draws perform.

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses:

Copy link

This is how benchmark results would change (along with a 95% confidence interval in relative change) if eca0fae is merged into master:

  •   :ballot_box_with_check:as_draws_array: 136ms -> 136ms [-0.51%, +0.83%]
  •   :rocket:as_draws_df: 64.6ms -> 61ms [-6.83%, -4.58%]
  •   :ballot_box_with_check:as_draws_list: 157ms -> 156ms [-1.13%, +0.17%]
  •   :ballot_box_with_check:as_draws_matrix: 58.8ms -> 58.7ms [-0.97%, +0.57%]
  •   :ballot_box_with_check:as_draws_rvars: 80ms -> 80.4ms [-0.49%, +1.55%]
  • ❗🐌summarise_draws_100_variables: 706ms -> 715ms [+0.92%, +1.51%]
  • ❗🐌summarise_draws_10_variables: 105ms -> 107ms [+0.93%, +2.1%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

@n-kall
Copy link
Collaborator Author

n-kall commented Mar 15, 2024

Looks like a >1% increase to summarise_draws times, which I don't think is acceptable

@paul-buerkner
Copy link
Collaborator

How much increase exactly? 1% would barely matter I think

@codecov-commenter
Copy link

codecov-commenter commented Mar 15, 2024

Codecov Report

Attention: Patch coverage is 93.75000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 95.33%. Comparing base (9de9857) to head (a28483f).

❗ Current head a28483f differs from pull request most recent head ded8b04. Consider uploading reports for the commit ded8b04 to get more accurate results

Files Patch % Lines
R/misc.R 90.90% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master     #356   +/-   ##
=======================================
  Coverage   95.33%   95.33%           
=======================================
  Files          50       50           
  Lines        3855     3860    +5     
=======================================
+ Hits         3675     3680    +5     
  Misses        180      180           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@n-kall
Copy link
Collaborator Author

n-kall commented Mar 15, 2024

  • ❗🐌summarise_draws_100_variables: 706ms -> 715ms [+0.92%, +1.51%]

  • ❗🐌summarise_draws_10_variables: 105ms -> 107ms [+0.93%, +2.1%]

Based on these ^

Copy link

This is how benchmark results would change (along with a 95% confidence interval in relative change) if a28483f is merged into master:

  •   :ballot_box_with_check:as_draws_array: 137ms -> 138ms [-0.57%, +1.33%]
  • ❗🐌as_draws_df: 64.7ms -> 66.8ms [+1.45%, +5.05%]
  •   :ballot_box_with_check:as_draws_list: 163ms -> 164ms [-0.39%, +2%]
  • ❗🐌as_draws_matrix: 60.5ms -> 63.6ms [+3.63%, +6.84%]
  •   :ballot_box_with_check:as_draws_rvars: 81.6ms -> 81.1ms [-1.61%, +0.46%]
  • ❗🐌summarise_draws_100_variables: 709ms -> 719ms [+0.98%, +1.77%]
  • ❗🐌summarise_draws_10_variables: 108ms -> 110ms [+0.75%, +2.83%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

@n-kall
Copy link
Collaborator Author

n-kall commented Mar 15, 2024

I've now added tests checking that the column types are as expected, and if the speed is ok, this is ready for review

@mjskay
Copy link
Collaborator

mjskay commented Mar 15, 2024

FWIW, I think the unnesting / flattening might be doable as a cbind on row vectors (or maybe vec_cbind / vec_rbind for safety) to create a 1-row data frame. Dunno if that helps at all.

Copy link

This is how benchmark results would change (along with a 95% confidence interval in relative change) if e175ba8 is merged into master:

  •   :ballot_box_with_check:as_draws_array: 156ms -> 155ms [-1.35%, +0.57%]
  •   :rocket:as_draws_df: 71.3ms -> 68.4ms [-5.59%, -2.54%]
  •   :ballot_box_with_check:as_draws_list: 176ms -> 175ms [-1.7%, +0.47%]
  •   :ballot_box_with_check:as_draws_matrix: 71ms -> 70.5ms [-1.68%, +0.53%]
  •   :ballot_box_with_check:as_draws_rvars: 84ms -> 84.3ms [-0.89%, +1.65%]
  • ❗🐌summarise_draws_100_variables: 724ms -> 733ms [+0.35%, +1.98%]
  • ❗🐌summarise_draws_10_variables: 112ms -> 114ms [+1.05%, +2.91%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

@paul-buerkner
Copy link
Collaborator

Speed is okay. Still, is this already the best way to implement the flattening?

@n-kall
Copy link
Collaborator Author

n-kall commented Mar 15, 2024

unnesting / flattening might be doable as a cbind on row vectors (or maybe vec_cbind / vec_rbind for safety) to create a 1-row data frame.

Thanks for the tip! Simplifying this unnesting/flattening would, I think, be beneficial. However, I can't quite get the right structure with rbind or vec_rbind as the quantile2 column ends up with vectors of length 2 (as opposed to separate columns as in unlist). Looks like it would still need some kind of column splitting (e.g. tidyr::unnest_wider)

create_summary_list(example_draws(), 3, list(rhat = rhat_basic, quantile2 = quantile2), NULL) |>
   vec_rbind()

##       rhat            quantile2
## 1 1.014967 -1.231353, 18.874003

create_summary_list(example_draws(), 3, list(rhat = rhat_basic, quantile2 = quantile2), NULL) |>
   unlist()

##         rhat  quantile2.q5 quantile2.q95
##     1.014967     -1.231353     18.874003

@mjskay
Copy link
Collaborator

mjskay commented Mar 15, 2024

Ah right... you might be able to use cbind if you conditionally transpose (or rbind) vectors of length > 1 to turn them into row vectors (not on a computer atm so I can't try it)

@n-kall
Copy link
Collaborator Author

n-kall commented Mar 15, 2024

is this already the best way to implement the flattening

I don't think the issue is so critical that the fix is needed right away, so perhaps it's better to leave this open in case some other ideas come up (e.g. if @mjskay's method works out). However, I won't spend more time on this now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

In summarise draws, a function that returns a character changes all columns to character
4 participants